774 research outputs found

    Breakdowns

    We study a continuous-time game of strategic experimentation in which the players try to assess the failure rate of some new equipment or technology. Breakdowns occur at the jump times of a Poisson process whose unknown intensity is either high or low. In marked contrast to existing models, we find that the cooperative value function does not exhibit smooth pasting at the efficient cut-off belief. This finding extends to the boundaries between continuation and stopping regions in Markov perfect equilibria. We characterize the unique symmetric equilibrium, construct a class of asymmetric equilibria, and elucidate the impact of bad versus good Poisson news on equilibrium outcomes.

    Strategic Experimentation with Poisson Bandits

    We study a game of strategic experimentation with two-armed bandits where the risky arm distributes lump-sum payoffs according to a Poisson process. Its intensity is either high or low, and unknown to the players. We consider Markov perfect equilibria with beliefs as the state variable. As the belief process is piecewise deterministic, payoff functions solve differential-difference equations. There is no equilibrium where all players use cut-off strategies, and all equilibria exhibit an 'encouragement effect' relative to the single-agent optimum. We construct asymmetric equilibria in which players have symmetric continuation values at sufficiently optimistic beliefs yet take turns playing the risky arm before all experimentation stops. Owing to the encouragement effect, these equilibria Pareto dominate the unique symmetric one for sufficiently frequent turns. Rewarding the last experimenter with a higher continuation value increases the range of beliefs where players experiment, but may reduce average payoffs at more optimistic beliefs. Some equilibria exhibit an 'anticipation effect': as beliefs become more pessimistic, the continuation value of a single experimenter increases over some range because a lower belief means a shorter wait until another player takes over.
    Keywords: Strategic Experimentation; Two-Armed Bandit; Poisson Process; Bayesian Learning; Piecewise Deterministic Process; Markov Perfect Equilibrium; Differential-Difference Equation
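The piecewise deterministic belief process mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper; it applies the textbook Bayes rule for a Poisson process whose intensity is one of two known values, assuming a single risky arm observed continuously. The function name and parameter names (`posterior_belief`, `lam_high`, `lam_low`) are hypothetical.

```python
import math


def posterior_belief(prior, lam_high, lam_low, elapsed, arrivals):
    """Posterior probability that the Poisson intensity is lam_high.

    Assumptions (illustrative, not from the abstract): the intensity is
    either lam_high or lam_low with lam_high > lam_low > 0, and the arm
    has been observed for `elapsed` time units with `arrivals` lump sums.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not lam_high > lam_low > 0.0:
        raise ValueError("need lam_high > lam_low > 0")
    # Work in log-odds: Bayes' rule becomes additive.
    log_odds = math.log(prior / (1.0 - prior))
    # Each lump sum multiplies the odds by lam_high / lam_low (upward jump).
    log_odds += arrivals * math.log(lam_high / lam_low)
    # Time without lump sums shrinks the odds at rate lam_high - lam_low
    # (the deterministic drift between jumps).
    log_odds -= (lam_high - lam_low) * elapsed
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Belief drifts down while nothing arrives, then jumps up on a lump sum.
    print(posterior_belief(0.5, 2.0, 1.0, elapsed=1.0, arrivals=0))
    print(posterior_belief(0.5, 2.0, 1.0, elapsed=1.0, arrivals=1))
```

Working in log-odds keeps the two ingredients separate: a smooth downward drift between lump sums and a discrete upward jump at each one, which is why value functions in such models involve both a derivative term and a difference term.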

    Market Experimentation in a Dynamic Differentiated-Goods Duopoly

    We study the evolution of prices in a symmetric duopoly where firms are uncertain about the degree of product differentiation. Customers sometimes perceive the products as close substitutes, sometimes as highly differentiated. Firms learn about their competitive environment from the quantities sold and a background signal. As the informativeness of the market outcomes increases with the price differential, there is scope for active learning. In a setting with linear demand curves, we derive firms' pricing strategies as payoff-symmetric mixed or correlated Markov perfect equilibria of a stochastic differential game where the common posterior belief is the natural state variable. When information has low value, firms charge the same price as would be set by myopic players, and there is no price dispersion. When firms value information more highly, on the other hand, they actively learn by creating price dispersion. This market experimentation is transient, and most likely to be observed when the firms' environment changes sufficiently often, but not too frequently.
    Keywords: Duopoly experimentation, Bayesian learning, stochastic differential game, Markov-perfect equilibrium, mixed strategies, correlated equilibrium

    Strategic Experimentation: The Case of the Poisson Bandits

    This paper studies a game of strategic experimentation in which the players learn from the experiments of others as well as their own. We first establish the efficient benchmark where the players co-ordinate in order to maximise joint expected payoffs, and then show that, because of free-riding, the strategic problem leads to inefficiently low levels of experimentation in any equilibrium when the players use stationary Markovian strategies. Efficiency can be approximately retrieved provided that the players adopt strategies which slow down the rate at which information is acquired; this is achieved by their taking periodic breaks from experimenting, which get progressively longer. In the public information case (actions and experimental outcomes are both observable), we exhibit a class of non-stationary equilibria in which the ε-efficient amount of experimentation is performed, but only in infinite time. In the private information case (only actions are observable, not outcomes), the breaks have two additional effects: not only do they enable the players to finesse the inference problem, but also they serve to signal their experimental outcome to the other player. We describe an equilibrium with similar non-stationary strategies in which the ε-efficient amount of experimentation is again performed in infinite time, but with a faster rate of information acquisition. The equilibrium rate of information acquisition is slower in the former case because the short-run temptation to free-ride on information acquisition is greater when information is public.

    Strategic Experimentation with Exponential Bandits

    This paper studies a game of strategic experimentation with two-armed bandits whose risky arm might yield a payoff only after some exponentially distributed random time. Because of free-riding, there is an inefficiently low level of experimentation in any equilibrium where the players use stationary Markovian strategies with posterior beliefs as the state variable. After characterizing the unique symmetric Markovian equilibrium of the game, which is in mixed strategies, we construct a variety of pure-strategy equilibria. There is no equilibrium where all players use simple cut-off strategies. Equilibria where players switch finitely often between the roles of experimenter and free-rider all lead to the same pattern of information acquisition; the efficiency of these equilibria depends on the way players share the burden of experimentation among them. In equilibria where players switch roles infinitely often, they can acquire an approximately efficient amount of information, but the rate at which it is acquired still remains inefficient; moreover, the expected payoff of an experimenter exhibits the novel feature that it rises as players become more pessimistic. Finally, over the range of beliefs where players use both arms a positive fraction of the time, the symmetric equilibrium is dominated by any asymmetric one in terms of aggregate payoffs.
    Keywords: Strategic Experimentation; Two-Armed Bandit; Exponential Distribution; Bayesian Learning; Markov Perfect Equilibrium; Public Goods
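To make the link between the number of experimenters and the speed of learning concrete, here is a small Python sketch. It is not from the paper, and it rests on stated assumptions: a good risky arm produces its first payoff at a known rate `arrival_rate`, a bad arm never pays, and all players' arms are of the same type, so that with `experimenters` players on the risky arm the odds of a good arm decay at rate `experimenters * arrival_rate` while no payoff is seen. The function name and the `threshold` argument are hypothetical; no cut-off value from the paper is used.

```python
import math


def time_to_reach_belief(prior, threshold, arrival_rate, experimenters=1):
    """Time without any payoff until the belief falls from prior to threshold.

    Assumed model (illustrative): odds(t) = odds(0) * exp(-k * rate * t),
    where k is the number of players using the risky arm.
    """
    if not 0.0 < threshold < prior < 1.0:
        raise ValueError("need 0 < threshold < prior < 1")
    if arrival_rate <= 0.0 or experimenters < 1:
        raise ValueError("need a positive rate and at least one experimenter")
    prior_odds = prior / (1.0 - prior)
    threshold_odds = threshold / (1.0 - threshold)
    # Solve prior_odds * exp(-k * rate * t) = threshold_odds for t.
    return math.log(prior_odds / threshold_odds) / (experimenters * arrival_rate)


if __name__ == "__main__":
    # Doubling the number of experimenters halves the waiting time.
    print(time_to_reach_belief(0.5, 0.25, arrival_rate=1.0))
    print(time_to_reach_belief(0.5, 0.25, arrival_rate=1.0, experimenters=2))
```

Under these assumptions the amount of information needed to move the belief between two given levels is fixed, and only the rate at which it arrives depends on how many players experiment. This is the sense in which an equilibrium can acquire a given amount of information while still doing so at an inefficient rate.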

    Strategic Experimentation: The Case of Poisson Bandits

    This paper studies a game of strategic experimentation in which the players have access to two-armed bandits where the risky arm distributes lump-sum payoffs according to a Poisson process with unknown intensity. Because of free-riding, there is an inefficiently low level of experimentation in any equilibrium where the players use stationary Markovian strategies. We characterize the unique symmetric Markovian equilibrium of the game, which is in mixed strategies. A variety of asymmetric pure-strategy equilibria is then constructed for the special case where there are two players and the arrival of the first lump-sum fully reveals the quality of the risky arm. Equilibria where players switch finitely often between the roles of experimenter and free-rider all lead to the same pattern of information acquisition; the efficiency of these equilibria depends on the way players share the burden of experimentation among them. We show that at least for relatively pessimistic beliefs, even the worst asymmetric equilibrium is more efficient than the symmetric one. In equilibria where players switch roles infinitely often, they can acquire an approximately efficient amount of information, but the rate at which it is acquired still remains inefficient.
    Keywords: strategic experimentation, two-armed bandit, Poisson process, Bayesian learning, Markov perfect equilibrium, public goods

    Price Dispersion and Learning in a Dynamic Differentiated-Goods Duopoly

    We study the evolution of prices set by duopolists who are uncertain about the perceived degree of product differentiation. Customers sometimes view the products as close substitutes, sometimes as highly differentiated. As the informativeness of the quantities sold increases with the price differential, there is scope for active learning by firms. When information has low value to the firms, they charge the same price as would be set by myopic players, and there is no price dispersion. When firms value information more highly, on the other hand, they actively learn by creating price dispersion. Such price dispersion arises in a cyclical fashion, and is most likely to be observed when the firms' environment changes sufficiently often, but not too frequently. Firms' payoffs are higher when they use correlated pricing strategies. Contrary to what one might expect, such coordination need not hurt consumers, provided they are sufficiently impatient.

    Undiscounted Bandit Games

    We analyze continuous-time games of strategic experimentation with two-armed bandits when there is no discounting. We show that for all specifications of prior beliefs and payoff-generating processes that satisfy some separability condition, the unique symmetric Markov perfect equilibrium can be computed in a simple closed form involving only the expected current payoff of the risky arm and the expected full-information payoff, given current information. The separability condition holds in a variety of models that have been explored in the literature, all of which assume that the risky arm's expected payoff per unit of time is time-invariant and actual payoffs are generated by a process with independent and stationary increments. The separability condition does not hold when the expected payoff per unit of time is subject to state-switching.